(57) ABSTRACT

An improved cache memory and method of operation 
thereof. The cache memory includes a doubly-linked loop of 
cache lines and a single pointer operable to address a cache 
line in the doubly-linked loop. In the cache memory, the 
pointer is preferably operable to address a next cache line in 
the doubly-linked loop, or a previous cache line in the 
doubly-linked loop. The cache memory as described permits 
a reduction in the number of instruction steps involved in 
controlling the cache lines. The improved cache memory 
may be implemented in a data processing system or within 
a computer program product. 

14 Claims, 2 Drawing Sheets 
CACHE MEMORY SYSTEM AND METHOD UTILIZING DOUBLY-LINKED LOOP
OF CACHE LINES AND A SINGLE POINTER TO ADDRESS A CACHE LINE
IN THE DOUBLY-LINKED LOOP

BACKGROUND OF THE INVENTION

1. Technical Field of the Invention

The present invention is concerned with a system, apparatus
and method for controlling cache memory in a data processing
system, and in particular to the provision of improved cache
addressing in a data processing system. The present invention
is of general applicability in data processing systems, and
particularly where the speed at which data can be retrieved is
of concern.

2. Description of the Related Art

In conventional computer systems, instructions and data are
stored in main storage and fetched from main storage by a
memory management system for execution or use by a central
processor unit, or possibly by some special function unit,
such as a floating-point processor. In some systems, some
instructions and data may be retained after their use in a
cache memory which can be accessed more quickly than the main
storage, so that such instructions and data can be reused
later in the execution of the same program. This improves the
execution performance of the computer system by reducing the
time taken to fetch the instructions and data for processing
by the central processing unit.

In systems having caching, the number of cycles taken to
retrieve an instruction or a data item depends on whether the
instruction or data item is already in the cache or not, and
on how many instructions are required to address or retrieve
the instruction or data item. If the instruction or data item
is not in the cache (a "cache miss"), the instruction or data
item must be fetched from main memory, which consumes some
number of instruction cycles. If the data item or instruction
is in the cache, some instruction cycles will also be
consumed, although they will be fewer than in the case of a
cache miss. Nevertheless, any improvement that can be made in
the processing of cached data and instructions is useful,
and, in certain circumstances, may make a considerable
difference to the processing performance of the system.

Improvements in cache memory performance have been sought
utilizing various methods of linking and associating groups
of cache lines. One example is the use of set-associative
caching, wherein each cache line is placed in a logically
appropriate set, and the addressing mechanism then locates
first the set, and then the individual cache line within that
set. In caches comprising simple set-associative mechanisms
based on addressing, it is not necessary to store the full
address in each cache line; part of the address can be
deduced from the set association itself.

Another technique frequently used is a hash table. A hash
table is, in effect, an abbreviated index to the cache lines,
which reduces the average time taken in searching for an
entry or in determining that the data is not present in the
cache and will therefore need to be fetched from main memory.

There are also various techniques for arranging the contents
of a cache memory. For example, the cache lines may be
arranged as a linked list, in which each element except the
last contains a pointer forward to the next element. An
element can thus be located by following the chain of pointer
references. A more sophisticated technique makes use of a
doubly-linked list, in which each element except the last
contains a pointer forward to the next element, and each
element except the first contains a pointer backward to the
previous element. The process of locating an element in such
a structure can thus proceed either forwards or backwards
along the chain of pointer references.

An attempt to solve the problems of cache management in using
hash tables and a least-recently-used cache line eviction
technique is disclosed in U.S. Pat. No. 5,778,430, which also
describes the use of linked lists and circular linked lists
to improve the efficiency of searching for the
least-recently-used cache line.

These techniques of organizing and addressing caches have
disadvantages in terms of the numbers of instructions
required to process them. Caches addressed using
set-associative techniques and hash tables can still have
problems in terms of the number of instructions required to
process insertions, deletions and the initial addressing of a
cache line. The known cache arrangements, such as linked
lists and doubly-linked lists, require extra instructions to
handle the various special cases, such as the case of an
empty list, or the case of a deletion of a last cache line
from a list.

SUMMARY OF THE INVENTION

It is therefore an object of the present invention to provide
an improved data processing system.

It is another object of the present invention to provide an
improved cache memory and method of operation thereof.

To achieve the foregoing objects, and in accordance with the
invention as embodied and broadly described herein, an
improved cache memory is disclosed. The cache memory includes
a doubly-linked loop of cache lines and a single pointer
operable to address a cache line in the doubly-linked loop. A
doubly-linked loop, also sometimes called a circular
doubly-linked list, is advantageously utilized to provide a
set of efficient primitive operations for addressing and
manipulating the cache lines.

In one embodiment of the present invention, the pointer is
operable to address a next cache line in the doubly-linked
loop. Alternatively, in another advantageous embodiment, the
pointer is operable to address a previous cache line in said
doubly-linked loop. In a related embodiment, the pointer is
stored in a register.

In another embodiment of the present invention, the cache
memory further includes cache lines having address data, a
dirty marker and an empty marker arranged as a
singly-loadable unit.

In a second aspect of the present invention, a data
processing system includes a processor, a main memory and at
least one cache memory having a doubly-linked loop of cache
lines and a single pointer operable to address a cache line
in the doubly-linked loop.

In one embodiment of the present invention, the data
processing system further includes a plurality of cache
memories and a hash table of entries for addressing the
plurality of cache memories. Alternatively, in another
advantageous embodiment, the data processing system further
includes a plurality of cache memories and a set associative
mechanism for addressing the plurality of cache memories. In
a related embodiment, the data processing system further
includes a pointer stored in a register of the processor.

In a third aspect of the present invention, a method for
implementing a cache memory is disclosed. The method
includes utilizing a doubly-linked loop of cache lines and
employing a single pointer operable to address a cache line
in the doubly-linked loop. To purge a cache line, the method
further includes marking a current cache line as empty and
clean. Next, the pointer is set to point to a next cache
line. To load an item into the cache line, the method further
includes pointing the pointer to a cache line at a
least-recently-used position. Thereafter, the item to be
loaded is loaded from a memory into the cache line.

The present invention advantageously utilizes the simpli- 
fied pointer manipulation operations available to a doubly- 
linked loop to give an improvement in code speed and 
compactness and to reduce code complexity. The present 
invention does not require the special-case processing that is 
necessary in typical previously-existing solutions to the 
problems of cache memory control. 

The advantage of a loop over a list is the increased 
symmetry. With a list, it is necessary to do special processing 
when you reach an end, whereas loops do not have ends, and 
thus the special processing is not necessary. The anchor 
structure for a doubly-linked loop is a simple pointer, for
example, to the head position. The anchor structure for a 
doubly-linked list is two pointers — one to the head position 
and one to the tail position. The extra pointer is necessary 
because the "prev" pointer of the head element and the 
"next" pointer of the tail element are not used to hold useful 
information. 

With a loop it is possible to step from the "tail" to the 
"head" without additional overhead. With a doubly-linked 
loop it is possible to go either way. Hence it is easy to make 
the "tail" element the new "head" element merely by step- 
ping the "head pointer" one step back. It is not necessary to 
physically remove the tail element and prepend it to the head 
(that is, it is not necessary to break and remake any of the 
links). Similarly, it is possible to logically move the head
element to the tail position by a simple forward step of the
"head pointer".

It typically costs one processor instruction to step a 
pointer one position around the loop. The cost of removing 
an element and reinserting it elsewhere in the list is consid- 
erably higher. 
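
By way of illustration only, the single-step primitive described above might be sketched in C as follows. The names `struct node`, `make_loop`, `tail_to_head` and `head_to_tail` are hypothetical and do not appear in the disclosure; the sketch assumes the loop contains at least one element.

```c
#include <stddef.h>

/* Hypothetical loop element: only the two link pointers matter here. */
struct node {
    struct node *prev, *next;
    int id;
};

/* Link n array elements (n >= 1) into a doubly-linked loop and
 * return the anchor, which is a single pointer to the head. */
struct node *make_loop(struct node *nodes, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        nodes[i].id = (int)i;
        nodes[i].next = &nodes[(i + 1) % n];
        nodes[i].prev = &nodes[(i + n - 1) % n];
    }
    return &nodes[0];
}

/* Make the tail the new head: one pointer step, no links rewritten. */
struct node *tail_to_head(struct node *head)
{
    return head->prev;
}

/* Logically move the head to the tail position: one step forward. */
struct node *head_to_tail(struct node *head)
{
    return head->next;
}
```

In this sketch each step is a single load through the anchor pointer, whereas physically removing an element and reinserting it elsewhere would rewrite several link pointers.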

Doubly-linked loops also have the same advantage as 
doubly-linked lists: it is easy to remove an element from the 
list. However, the overhead for adding an element to the list 
is higher than for a singly-linked loop because there are more
link pointers to update. 

The present invention organizes cache lines into a doubly- 
linked loop structure in such a way that the single step 
primitive is efficiently used. Examples include promoting an 
unused (or LRU but occupied) cache line to the head 
position prior to filling it, or demoting a newly purged cache 
line from the head position. Additionally because the anchor 
structure is a single pointer, it is easier to hold it in a register 
within the processor for efficient operation. 

Additionally the invention makes use of certain special-
izations of the doubly-linked loop, most notably that there is
more than one cache line within the cache. Thus, when 
moving a random cache line to the head position, it can be 
assumed that the loop will be non-empty after the removal 
of the cache line from its original position prior to insertion 
into its new position. This removes the need for code to 
handle the empty case, making the code simpler and faster. 
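
The absence of an empty-case branch can be seen in a minimal C sketch of moving an arbitrary element to the head position. The function name `move_to_head` and the `struct node` layout are hypothetical, and the sketch relies on the stated assumption that the loop holds more than one element.

```c
/* Hypothetical loop element. */
struct node {
    struct node *prev, *next;
    int id;
};

/* Move t to the head position and return the new head.
 * Assumes the loop holds at least two elements, so it is still
 * non-empty after t is unlinked and no empty-case test is needed. */
struct node *move_to_head(struct node *head, struct node *t)
{
    if (t == head)
        return head;              /* already at the head position */

    t->prev->next = t->next;      /* unlink t from its old position */
    t->next->prev = t->prev;

    t->next = head;               /* relink t just before the old head */
    t->prev = head->prev;
    head->prev->next = t;
    head->prev = t;

    return t;
}
```

The only conditional in this sketch is the test for the target already being at the head; there are no tests for a null anchor or for a list becoming empty.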

The foregoing description has outlined, rather broadly, 
preferred and alternative features of the present invention so 
that those skilled in the art may better understand the 
detailed description of the invention that follows. Additional 
features of the invention will be described hereinafter that
form the subject matter of the claims of the invention. Those
skilled in the art should appreciate that they can readily use
the disclosed conception and specific embodiment as a basis
for designing or modifying other structures for carrying out
the same purposes of the present invention. Those skilled in
the art should also realize that such equivalent constructions
do not depart from the spirit and scope of the invention in its
broadest form.

BRIEF DESCRIPTION OF THE DRAWINGS 

A preferred embodiment of the present invention will now 
be described by way of example, with reference to the 
accompanying drawings, in which: 

FIG. 1 is a block diagram of an exemplary computer 
system having a cache memory according to the present 
invention; and 

FIG. 2 is a detailed diagram of an embodiment of a cache
memory according to the present invention.

DETAILED DESCRIPTION 

In FIG. 1, a computer system 101 includes a processor
102 and a storage 103, which may represent main memory
or an external data storage device, such as disk storage,
optical storage, tape storage or similar storage devices. The
computer system also includes a cache memory 104 having
a plurality of cache lines, designated 105, 106 and 107. Only
three of the plurality of cache lines are shown; in practice
there may be a large number. In the illustrated figure, cache
line 105 represents the most-recently-used (MRU) cache
line, and cache line 107 represents the least-recently-used
(LRU) cache line. As in any computer system having cache
memory, the processor 102 may make a request for data. The
cache memory mechanism is capable of searching the contents
of the cache memory for the presence of a particular
requested data item. The various searching means are
well-known in the art, and may include the use of hash tables,
set-associative searching means, and the like. If the
searching means determines that the required data item is not
contained in the cache memory, the computer system
retrieves the data item from storage 103.

Referring now to FIG. 2, there is depicted an embodiment
of a cache memory according to the present invention in
which a pointer 201 points to a most-recently-used cache
line 202. Each cache line 202, 203, 204, 205, 206, 207 has
associated forward (next) and backward (prev) pointers to
the respective next and previous cache lines. It can be seen
in the illustrated figure that the pointer operations that set the
pointer to "next" or "prev" have the effect of, as it were,
"rotating" the doubly-linked loop in either a counterclockwise
or a clockwise direction.

The cache comprises a single pointer "ptr" 201 and a
doubly-linked loop (that is, a doubly-linked list connected as
a loop) of a number of cache lines. The pointer "ptr" 201 is
the fundamental base pointer for accessing the data in the
cache line. Doubly-linked loops are well-adapted for the
removal and insertion of list elements from or to any
arbitrary position in the list. The doubly-linked loop is a very
symmetrical structure with no start or end, which eliminates
certain special case tests for the boundary conditions. During
operation the loop is never empty, which again simplifies
the code because the empty case never arises.

The system implements a strict Least Recently Used
(LRU) cache: if it is necessary to flush a cache line to make
way for a new entry, the line that is flushed will be the one
that was least recently used. In the illustrated embodiment,
each cache line contains the following information:

1. A pointer to the previous entry in the loop: "prev".

2. A pointer to the next entry in the loop: "next".

3. A boolean flag to indicate whether the cache line is
empty: "empty". If the cache line is not empty it is said to
be full. Another possible representation of this flag is the
address being set to a reserved value, for example 0.

4. A boolean flag to indicate whether that cache line is
dirty: "dirty". A cache line is dirty if its contents have
been updated locally and hence may not match the
corresponding data in external storage. If the cache line is
not dirty, it is said to be clean.

5. The external address of the cached item: "address".

6. The data for the cached item: "data". The contents of
data are application dependent.

An additional optimization may be obtained by combining
items 3, 4 and 5 from the above list in a single loadable and
writeable element, or "word". This advantageously permits the
information to be compressed and exploits the difference in
processing time taken to perform the initial load of a word
containing an address and the time taken to perform
subsequent processing on the same word: the initial loading
is slower than subsequent processing using, for example,
masking and comparison instructions. Thus a single load, in
this case, makes the three items of information available at
a lower processing cost than would be the case if they were
stored as separate items. Similarly advantageously, a single
store instruction can be used to write the three items of
information.

At initialization, each cache line is set to empty and clean,
and the "prev" and "next" pointers are initialized such that
the cache lines form a doubly-linked loop. The pointer "ptr"
201 is set to point to an arbitrarily chosen cache line. The
pointer "ptr" 201 always points to the "current" entry in the
cache. This will either be empty (in which case the entire
cache is empty), or by definition must be the most recently
used "MRU" entry in the cache. Following the loop round in
the "next" direction leads to successively less recently used
entries, and then 0 or more empty cache lines. The ordering
of the set of empty cache lines in the latter part of the
loop is not important. FIG. 2 illustrates an exemplary state
of the cache with a number of full lines and two empty lines.

The construct "ptr->prev" is the address of either an empty
cache line, or the least recently used "LRU" entry in the
cache. The cache line pointed to by "ptr" is designated the
most-recently-used (MRU) position and the cache line pointed
to by "ptr->prev" is designated the least-recently-used (LRU)
position, although the cache line at that position may, in
fact, be empty. The cache is empty if and only if the cache
line in the MRU position is empty. The cache is full if and
only if the cache line in the LRU position is full.

In the cache memory of the preferred embodiment, the
following operations are available:

1. Make a "target" cache line the current cache line. If
"target" is already the current line, then there is nothing
to do. If it is not, remove "target" from its position in the
loop. Insert "target" into the loop before the current line
pointed to by "ptr". Perform the assignment "ptr=target",
which has the effect of pointing "ptr" at "target".

2. Purge the current cache line. Mark the current cache line
as empty and clean. Perform the assignment "ptr=ptr->next",
which has the effect of moving the now empty cache line to
the LRU position; all the other cache lines move
counterclockwise one position.

3. Flush the current cache line. If the cache line is full
and dirty, then write the data it contains back to external
storage. Purge the current cache line as above.

4. Get an entry into the cache. Search the cache sequentially
in the "next" direction starting at the MRU position until
one of the following occurs:
  a) The entry is found in the cache ("cache-hit"), in which
  case we make that cache line the current cache line as
  above.
  b) An empty cache line is found ("cache-miss"), in which
  case we load the entry as a new item into the cache as
  below. In this case, it is not necessary to flush the cache
  line in the LRU position.
  c) The search process has iterated right round the loop
  ("cache-miss"), in which case we load the entry as a new
  item into the cache as below. In this case, it is necessary
  to flush the cache line in the LRU position.

5. Load a new item into the cache (cache-miss). Perform the
assignment "ptr=ptr->prev", which has the effect of moving
the cache line at the LRU position to the current position,
and moving the other cache lines clockwise one position. If
the current cache line is full and dirty, write the data it
contains back to external storage. This is the case where the
cache is completely full so the system frees up the LRU cache
line for reuse. Load the relevant data from external storage
to the current cache line. Update the address of the current
cache line. Mark the current cache line clean. Mark the
current cache line full. (In the case where the address,
dirty marker and empty marker are combined in a single word,
these last three steps can be achieved using a single store
instruction.)

6. Flush the entire cache. While the current cache line is
full, flush the current cache line as above and iterate.

The described embodiment is of particular usefulness in
implementing a cache memory control mechanism in software. In
this area, a typical implementation might keep a number of
different lists (for example, empty cache lines and full
cache lines could be kept on different lists), and the
software might have to go through a number of special case
checks.

In the present embodiment, the use of a doubly-linked loop
provides a useful set of fast primitive operations that
coincide well with the requirements for the implementation of
a software cache. For example when a new cache line is
required, the operation "ptr=ptr->prev" has the effect of
moving the referenced cache line into the current position at
the same time as moving all the other cache lines one
position clockwise. This operation makes the correct cache
line current, whether or not it is already full. In a typical
previously-existing solution, the correct cache line would
either be the head element of the list of empty cache lines,
or otherwise the tail element of the list of full cache lines
(if the list of empty cache lines is empty). The code would
therefore be slower.

Similarly when a cache line is flushed or purged it would be
moved from the list of full cache lines to the list of empty
cache lines. In the present embodiment, the operation
"ptr=ptr->next" will achieve the corresponding operation more
quickly.

The key control structure is a single pointer "ptr". In an
advantageous embodiment, it is possible to hold this pointer
permanently in a register internal to the processor for
further increased speed of operation. A typical
previously-existing solution may maintain a number of
structures such as lists, the key control structures of which
would not be held permanently in registers internal to the
processor.
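
The cache line layout and the purge, flush, get and load operations of the preferred embodiment might be sketched in C roughly as follows. This is an illustrative sketch under stated assumptions, not the implementation of the disclosure: all identifiers (`struct line`, `struct cache`, `cache_init`, `make_current`, `purge`, `flush`, `load`, `get`), the size `NLINES`, the `store` array standing in for external storage, and the particular single-word layout (a word index shifted left by two bits, with the two low bits used as the empty and dirty markers) are hypothetical.

```c
#include <stdint.h>

#define NLINES  4u   /* hypothetical size; the sketch assumes >= 2 lines */
#define F_EMPTY 1u   /* assumed bit 0 of the combined word: empty marker */
#define F_DIRTY 2u   /* assumed bit 1 of the combined word: dirty marker */
#define F_MASK  3u

struct line {
    struct line *prev, *next;
    uint32_t tag;    /* (word index << 2) | F_DIRTY | F_EMPTY */
    int data;        /* application-dependent contents */
};

struct cache {
    struct line lines[NLINES];
    struct line *ptr;   /* the single pointer: current (MRU) line */
    int *store;         /* stands in for external storage */
};

/* Initialization: every line empty and clean, linked into one loop. */
void cache_init(struct cache *c, int *store)
{
    for (unsigned i = 0; i < NLINES; i++) {
        c->lines[i].next = &c->lines[(i + 1) % NLINES];
        c->lines[i].prev = &c->lines[(i + NLINES - 1) % NLINES];
        c->lines[i].tag = F_EMPTY;
        c->lines[i].data = 0;
    }
    c->ptr = &c->lines[0];
    c->store = store;
}

/* Operation 1: make "target" the current line. */
void make_current(struct cache *c, struct line *t)
{
    if (t == c->ptr)
        return;
    t->prev->next = t->next;     /* remove; the loop stays non-empty */
    t->next->prev = t->prev;
    t->next = c->ptr;            /* insert before the current line */
    t->prev = c->ptr->prev;
    t->prev->next = t;
    c->ptr->prev = t;
    c->ptr = t;
}

/* Write a line back to storage only if it is full and dirty. */
void write_back(struct cache *c, struct line *l)
{
    if ((l->tag & F_MASK) == F_DIRTY)
        c->store[l->tag >> 2] = l->data;
}

/* Operation 2: purge the current line. */
void purge(struct cache *c)
{
    c->ptr->tag = F_EMPTY;       /* empty and clean */
    c->ptr = c->ptr->next;       /* emptied line is now at ptr->prev */
}

/* Operation 3: flush the current line. */
void flush(struct cache *c)
{
    write_back(c, c->ptr);
    purge(c);
}

/* Operation 5: load a new item on a cache miss. */
struct line *load(struct cache *c, uint32_t index)
{
    c->ptr = c->ptr->prev;       /* LRU (or an empty) line becomes current */
    write_back(c, c->ptr);
    c->ptr->data = c->store[index];
    c->ptr->tag = index << 2;    /* address, clean and full in one store */
    return c->ptr;
}

/* Operation 4: get an entry, searching from the MRU position. */
struct line *get(struct cache *c, uint32_t index)
{
    struct line *l = c->ptr;
    do {
        if (l->tag & F_EMPTY)
            break;                       /* miss: an empty line was found */
        if ((l->tag >> 2) == index) {    /* hit */
            make_current(c, l);
            return l;
        }
        l = l->next;
    } while (l != c->ptr);               /* miss: went right round the loop */
    return load(c, index);
}
```

A caller that updates a line locally would set `F_DIRTY` in its `tag`; in this sketch the updated data is then written back when that line reaches the LRU position and is reused, or when it is flushed. The caller is assumed to keep `index` within the bounds of `store`.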
The list operations in a typical previously-existing solution
are slow because they contain conditional paths relating
to the empty cases. For example, moving a cache line
between lists would involve checks for the source list
becoming empty and the destination list being originally
empty. In the present invention there is only one list (a
doubly-linked loop) which can never become empty so these
cases do not arise.

The present invention can be extended by known techniques.
For example, if it is desired to build a fast large
cache, a hash table may be inserted, each entry of which
references an instance of the present invention. A cache
operation then consists of identifying the correct hash table
entry, followed by the relevant cache operation on the said
entry according to the invention.
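
One possible shape for such an extension is sketched below in C. The table size `NBUCKETS`, the modulo hash, and the names `struct loop_cache`, `struct hashed_cache` and `select_cache` are assumptions made for illustration; the disclosure does not specify a hash function or table layout.

```c
#include <stdint.h>

#define NBUCKETS 8u   /* hypothetical number of hash table entries */

/* Hypothetical cache line; only the fields needed here are shown. */
struct line {
    struct line *prev, *next;
    uint32_t address;
};

/* One instance of the single-pointer loop cache per table entry. */
struct loop_cache {
    struct line *ptr;
};

struct hashed_cache {
    struct loop_cache bucket[NBUCKETS];
};

/* Step 1 of a cache operation: identify the hash table entry.
 * The modulo hash is an assumption made for this sketch. */
struct loop_cache *select_cache(struct hashed_cache *h, uint32_t address)
{
    return &h->bucket[address % NBUCKETS];
}
```

The relevant loop operation (get, load, purge or flush) would then be applied to the `ptr` of the selected entry, so that each sequential search covers only the lines of one small loop.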

The present invention may be embodied in other specific
forms without departing from its spirit or essential
characteristics. The described embodiments are to be considered in
all respects as illustrative and not restrictive. The scope of
the invention is, therefore, indicated by the appended claims
rather than by the foregoing description. All changes which
come within the meaning and range of equivalency of the
claims are to be embraced within their scope.

What is claimed is: 

1. A cache memory comprising:
a plurality of cache lines having address data, dirty marker
and empty marker arranged as a singly loadable and
writeable unit, wherein said plurality of cache lines is
organized as a doubly-linked loop; and
a single pointer operable to address a cache line in said
doubly-linked loop.

2. The cache memory as recited in claim 1, wherein said
single pointer is operable to address a next cache line in
said doubly-linked loop.

3. The cache memory as recited in claim 1, wherein said
single pointer is operable to address a previous cache
line in said doubly-linked loop.

4. The cache memory as recited in claim 1, wherein said
single pointer is stored in a register.

5. A data processing system comprising:
a processor;
a main memory; and
at least one cache memory including:
a plurality of cache lines having address data, dirty
marker and empty marker arranged as a singly
loadable and writeable unit, wherein said plurality of
cache lines is organized as a doubly-linked loop; and
a single pointer operable to address a cache line in said
doubly-linked loop.

6. The data processing system as recited in claim 5,
wherein said single pointer is stored in a register of said
processor.

7. The data processing system as recited in claim 5,
further comprising a plurality of cache memories and a hash
table of entries for addressing said plurality of cache
memories.

8. The data processing system as recited in claim 5, 
further comprising a plurality of cache memories and a set 
associative mechanism for addressing said plurality of cache 
memories. 

9. A method of implementing a cache memory, compris- 
ing the steps of: 

utilizing a doubly-linked loop of cache lines;

employing a single pointer operable to address a cache 
line in said doubly-linked loop; and 

purging a cache line in said cache memory, including: 
marking a current cache line as empty and clean; and 
pointing said single pointer to a next cache line. 

10. The method as recited in claim 9, further comprising 
the step of loading an item into said cache memory, said step 
of loading an item including the steps: 

pointing said single pointer to a cache line at a least- 
recently-used position; and 
loading said item from a memory into said cache line. 

11. The method as recited in claim 9, further comprising 
the step of storing said single pointer in a register. 

12. A computer program product, comprising: 

a computer-readable recording medium having stored
thereon computer executable instructions for imple-
menting a cache memory, said computer executable
instructions, when executed, performing the steps of:
utilizing a doubly-linked loop of cache lines;
employing a single pointer operable to address a cache
line in said doubly-linked loop; and
purging a cache line including:
marking a current cache line as empty and clean; and
pointing said single pointer to a next cache line.

13. The computer program product as recited in claim 12, 
further comprising the step of loading an item into said 
cache memory, said step of loading an item including the 
steps: 

pointing said single pointer to a cache line at a least-
recently-used position; and
loading said item from a memory into said cache line. 

14. The computer program product as recited in claim 12, 
further comprising the step of storing said single pointer in 
a register. 

***** 